In this thesis the accuracy of the Wald approximate
limits of the Sequential Probability Ratio Test
(SPRT) is investigated. The Wald approximate limits are
compared with "exact" empirical limits obtained by Monte
Carlo simulation. An extensive bibliography of references
associated with the SPRT is included.


A AS ROR Me Oh enn 


In statistical theory the size of a sample may or may
not be fixed prior to observation of certain sample values.
If, in a test of hypotheses, the sample size is not fixed
in advance, the decision to terminate sampling may depend
upon the values of the previous samples. Such a test is
said to be sequential.

The first mention of sequential test procedures was
by H. F. Dodge and H. G. Romig who, in 1929, constructed
a Double Sampling Plan. Prior to World War II, there were
not many entries in the literature concerning sequential
procedures. During World War II the Statistical Research
Group of Columbia University operated under a contract
with the Office of Scientific Research and Development
and was directed by the Applied Mathematics Panel of the
National Defense Research Committee. Milton Friedman and
W. Allen Wallis, members of the research group, recognized
the great potentialities and far reaching consequences
that sequential analysis might have; consequently various
members of the group, and in particular A. Wald, worked out
what is known as the SPRT. In the early 1940's many of
these results were classified; however, the Restricted
classification was removed in 1945.

Abraham Wald, in 1943, worked out the basic principles
for the SPRT [Wald, 1943]. During the next two years Wald
continued working on the basic principles of the SPRT,
including a general consideration of cumulative sums of
independent random variables which gives the Operating
Characteristic (OC) curve of any SPRT, and the characteristic
function of the number of observations required by
the test [Wald, 1944].

During this same period of time, independent work on
sequential inferences was also conducted in England by
G. A. Barnard [Barnard, 1946] who derived general results
similar to those obtained by the Statistical Research
Group at Columbia.

Since the unfortunate death of A. Wald in 1950, there
have been many varied contributions to the literature of
sequential methods. These contributions tend to deal
primarily with specific families of distributions. Many
of these articles are listed in the bibliography.


III. THEORETICAL BACKGROUND OF THE SPRT


The Sequential Probability Ratio Test of a simple Null
Hypothesis against a simple Alternate Hypothesis differs
from fixed sample size hypothesis tests in that it is
conducted in stages, where a stage constitutes evaluation
of an observation. At each stage one of three alternatives
is chosen: (1) discontinue sampling, accept the null
hypothesis; (2) discontinue sampling, reject the null
hypothesis; (3) draw another observation. The procedure
continues until one of the alternatives (1) or (2) is
chosen. Under quite general assumptions, the probability
of eventual termination of the SPRT is equal to one.¹


A. DEFINITION OF SPRT

Let the distribution of the random variable, $X$, under
consideration be given by the density or mass function,
$f(x; \theta)$. Let $H_0$ be the Null Hypothesis that $\theta = \theta_0$ and
$H_1$ be the Alternate Hypothesis that $\theta = \theta_1 > \theta_0$. Therefore
the density or mass function of $X$ is $f(x; \theta_0)$ when
$H_0$ is true and $f(x; \theta_1)$ when $H_1$ is true. Successive
independent observations on $X$ will be denoted $X_1, X_2, \ldots$.
After $n$ observations, the SPRT is based on the likelihood ratio

$$L_n = \prod_{i=1}^{n} \frac{f(x_i; \theta_1)}{f(x_i; \theta_0)}$$

¹A. Wald, Sequential Analysis, Wiley, N.Y., 1947, p. 157.
Choose two positive numbers $A$ and $B$, $A > 1$ and $B < 1$. After
each observation on $X$, the procedure for choosing one of
the three alternatives is:

(1) If $L_n \le B$: Discontinue Sampling, Accept $H_0$
(2) If $L_n \ge A$: Discontinue Sampling, Accept $H_1$
(3) If $B < L_n < A$: Draw another observation.
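The three-way rule above can be sketched in a few lines. This is only an illustrative sketch, not the thesis's FORTRAN code: the function names `sprt_decision` and `run_sprt` and the `log_ratio` callback are hypothetical, and the rule is applied to the logarithm of the likelihood ratio, which is equivalent to comparing the ratio itself with $A$ and $B$.

```python
import math

def sprt_decision(log_lr, A, B):
    """Classify a cumulative log likelihood ratio against bounds A > 1 > B > 0."""
    if log_lr <= math.log(B):
        return "accept_H0"      # alternative (1)
    if log_lr >= math.log(A):
        return "accept_H1"      # alternative (2)
    return "continue"           # alternative (3)

def run_sprt(observations, log_ratio, A, B):
    """Apply the rule after each observation.

    log_ratio(x) is assumed to return ln[f(x; theta1) / f(x; theta0)].
    Returns (decision, number of observations used).
    """
    total = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        total += log_ratio(x)
        decision = sprt_decision(total, A, B)
        if decision != "continue":
            return decision, n
    return "continue", n        # data ran out before a boundary was reached
```

For example, with a toy `log_ratio` that returns +1 for a success and -1 for a failure and bounds `A = 10`, `B = 0.1`, three successes in a row end the test in favor of $H_1$ at the third observation.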


B. DERIVATION OF STOPPING BOUNDS
The two constants $A$ and $B$ are determined so that the
test will have (nearly) the prescribed probabilities,
$\alpha$ and $\beta$, of making errors, where $\alpha$ is the probability of
making a Type I error (Rejecting $H_0$ when it is true) and
$\beta$ is the probability of making a Type II error (Accepting
$H_0$ when it is false). Exact values for $A$ and $B$ could, in
principle, be obtained from the following equations, given
in the first that $H_0$ is true and in the second that $H_1$ is
true:

$$\alpha = P[L_1 \ge A \mid \theta_0] + \sum_{n=2}^{\infty} P[L_n \ge A,\ B < L_i < A,\ i = 1, \ldots, n-1 \mid \theta_0]$$

$$\beta = P[L_1 \le B \mid \theta_1] + \sum_{n=2}^{\infty} P[L_n \le B,\ B < L_i < A,\ i = 1, \ldots, n-1 \mid \theta_1]$$

In practice, approximations for $A$ and $B$ developed by
Wald are usually used, where $\alpha$ and $\beta$ are specified a priori.²


The Wald approximate stopping bounds are

$$B = \frac{\beta}{1 - \alpha}$$

$$A = \frac{1 - \beta}{\alpha}$$
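As a small worked illustration, the Wald approximate bounds $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$ can be computed directly from the target error probabilities. The function name `wald_bounds` is hypothetical and the sketch is not taken from the thesis programs.

```python
def wald_bounds(alpha, beta):
    """Wald approximate stopping bounds for target error probabilities.

    alpha: target Type I error probability, beta: target Type II error
    probability; both are assumed to lie strictly between 0 and 1.
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    A = (1.0 - beta) / alpha    # upper bound: accept H1 when the ratio reaches A
    B = beta / (1.0 - alpha)    # lower bound: accept H0 when the ratio falls to B
    return A, B
```

With $\alpha = \beta = .05$ this gives $A = 19$ and $B = 1/19$, so the two bounds are symmetric on the logarithmic scale.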


²A. Wald, Sequential Analysis, op.cit., pp. 40-44.




The boundaries used in the tests performed for this
thesis were formulated as two diagonal lines with proper
intercepts.³ These boundaries are called the acceptance
number, $A_n$, and the rejection number, $R_n$. These numbers
are obtained by setting the logarithm of $L_n$, the likelihood
ratio, equal to the logarithms of $A$ and $B$. The
resulting test is the Wald SPRT with stopping bounds $A$
and $B$.


C. RANDOM WALK OF THE OBSERVATION RESULTS 

At each observation the value of the test statistic
was tabulated. The stepwise values of the test statistic
can be graphed with the abscissa being the number, $n$, of
observations made and the ordinate being the value of the
test statistic. The boundaries, $A_n$ and $R_n$, will limit the
steps of the random walk. The test terminates when either
boundary is reached or surpassed by a value of the test
statistic.


D. THE OC FUNCTION OF THE SPRT

The Operating Characteristic (OC) Function, $L(\theta)$, is
defined as the probability that the sequential test will
lead to acceptance of $H_0$ when $\theta$ is the true value of the
parameter. Using the approximations on the stopping


³A. Wald, "Sequential Tests of Statistical Hypotheses,"
Annals of Math. Stat., Vol. 16, No. 2, 1945.
bounds mentioned above, Wald showed that the OC function⁴
could be approximated by

$$L(\theta) \approx \frac{A^{h(\theta)} - 1}{A^{h(\theta)} - B^{h(\theta)}} \tag{1}$$

where $h(\theta)$ is a non-zero real number such that

$$\int_{-\infty}^{\infty} \left[\frac{f(x; \theta_1)}{f(x; \theta_0)}\right]^{h(\theta)} f(x; \theta)\, dx = 1 \tag{2}$$

or the equivalent summation in the case $f$ is a mass function.
It is sometimes difficult in practice to obtain the
value for $h(\theta)$ for each of various given values of $\theta$; in
such cases a "reverse" process may be used: set $h(\theta)$
equal to a non-zero real number and compute a corresponding
value of $\theta$. This technique was used in computing $L(\theta)$ for
Binomial, Normal, and Exponential distributions in this thesis.
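The OC value attached to a chosen $h$ in the "reverse" process follows directly from Wald's approximation $L(\theta) \approx (A^h - 1)/(A^h - B^h)$. The sketch below is illustrative only; the function name `oc_value` is hypothetical and no guard against floating-point overflow for very large $|h|$ is included.

```python
def oc_value(h, A, B):
    """Wald approximate OC value for a non-zero h and bounds A > 1 > B > 0."""
    if h == 0:
        raise ValueError("h must be non-zero")
    a_h = A ** h
    b_h = B ** h
    return (a_h - 1.0) / (a_h - b_h)
```

With the Wald bounds for $\alpha = \beta = .05$ ($A = 19$, $B = 1/19$), $h = 1$ returns $.95 = 1 - \alpha$ and $h = -1$ returns $.05 = \beta$, which are the OC values expected at $\theta_0$ and $\theta_1$ respectively.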


E. EXPECTED SAMPLE SIZE OF THE SPRT

As mentioned above, with probability one the SPRT
eventually terminates. Thus, using the approximate boundaries,
$A$ and $B$, and disregarding the "excess" of $L_N$ over
these boundaries at termination,

$$P[Z = \ln B \mid \theta] \approx L(\theta)$$

$$P[Z = \ln A \mid \theta] \approx 1 - L(\theta)$$


⁴A. Wald, Sequential Analysis, op.cit., pp. 48-52,
161-64.




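where $Z = \ln L_N = \sum_{i=1}^{N} z_i$, with $z_i = \ln[f(x_i; \theta_1)/f(x_i; \theta_0)]$, and where in turn $N$ is the
sample size required for termination. Therefore, the
conditional expected value of $Z$, given $\theta$, is approximately

$$E[Z \mid \theta] \approx L(\theta) \ln B + [1 - L(\theta)] \ln A$$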


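Utilizing Wald's Fundamental Identity,⁵ the conditional
expected value of $Z$, given $\theta$, can be written

$$E[Z \mid \theta] = E\left[\sum_{i=1}^{N} z_i \,\Big|\, \theta\right] = E[N \mid \theta]\, E[z_i \mid \theta]$$

The Expected Sample Size for the test is therefore given
approximately by

$$E[N] \approx \frac{E[Z \mid \theta]}{E[z_i \mid \theta]} = \frac{L(\theta) \ln B + [1 - L(\theta)] \ln A}{E[z_i \mid \theta]} \tag{3}$$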


If $E[z_i \mid \theta] = 0$, one may approximate the Expected Sample
Size by⁶

$$E[N] \approx \frac{E[Z^2 \mid \theta]}{E[z_i^2 \mid \theta]}$$
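The two approximations for the Expected Sample Size can be combined in one routine. This is a hedged sketch rather than the thesis's program: the function name `expected_sample_size`, the tolerance `tol`, and the argument names are hypothetical, and the zero-drift branch assumes the same "no excess" approximation for the second moment, $E[Z^2 \mid \theta] \approx L(\theta)(\ln B)^2 + [1 - L(\theta)](\ln A)^2$.

```python
import math

def expected_sample_size(L, A, B, ez, ez2=None, tol=1e-12):
    """Wald approximate E[N].

    L   : OC value L(theta)
    A, B: stopping bounds, A > 1 > B > 0
    ez  : E[z_i | theta], the mean log likelihood ratio per observation
    ez2 : E[z_i^2 | theta], needed only when ez is (numerically) zero
    """
    ln_a, ln_b = math.log(A), math.log(B)
    if abs(ez) > tol:
        # Eq. (3): E[Z | theta] / E[z_i | theta]
        return (L * ln_b + (1.0 - L) * ln_a) / ez
    if ez2 is None:
        raise ValueError("ez2 is required when ez is zero")
    # Zero-drift case: E[Z^2 | theta] / E[z_i^2 | theta]
    return (L * ln_b ** 2 + (1.0 - L) * ln_a ** 2) / ez2
```

For instance, with $A = 19$, $B = 1/19$, $L(\theta) = .95$ and $E[z_i \mid \theta] = -.5$, the routine returns $(-.9 \ln 19)/(-.5)$, roughly 5.3 observations.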

F. VARIANCE OF SAMPLE SIZE OF THE SPRT


It would appear that the Variance of the Sample Size,
$V[N]$, could be approximated in certain cases using an


⁵A. Wald, Sequential Analysis, op.cit., pp. 159-60.

⁶A. Wald, "Differentiation under the Expectation Sign in
the Fundamental Identity of Sequential Analysis," Annals of
Math. Stat., Vol. 17, No. 4, December 1946.
approach similar to that for $E[N]$. During his literature
review the author encountered only three references to
such an approximation [Wald, 1945a], [Walker, 1950],
[Cox and Roseberry, 1966a].


Since

$$V[N] = E[N^2] - E^2[N] \tag{4}$$

an approximation for $V[N]$ may be developed as follows:
$E^2[N]$ can be approximated using Eq. (3). An approximate
value for $E[N^2]$ may be derived from

$$E[Z^2 \mid \theta] = E\left[\left(\sum_{i=1}^{N} z_i\right)^2 \,\Big|\, \theta\right]$$

Utilizing Wald's Fundamental Identity, this may be written

$$E[Z^2 \mid \theta] = E[N]\, E[z_i^2 \mid \theta] + E[N(N-1)]\, E[z_i \mid \theta]\, E[z_j \mid \theta]$$

where, as before, $z_i$ and $z_j$ are independent and identically
distributed. Expanding and collecting terms

$$E[Z^2 \mid \theta] = E[N]\left(E[z_i^2 \mid \theta] - E^2[z_i \mid \theta]\right) + E[N^2]\, E^2[z_i \mid \theta]$$

$$E[Z^2 \mid \theta] = E[N]\, V[z_i \mid \theta] + E[N^2]\, E^2[z_i \mid \theta]$$

Then

$$E[N^2] = \frac{E[Z^2 \mid \theta] - E[N]\, V[z_i \mid \theta]}{E^2[z_i \mid \theta]} \tag{5}$$
with

$$E[Z^2 \mid \theta] \approx L(\theta)(\ln B)^2 + [1 - L(\theta)](\ln A)^2$$

Using Eqs. (3) and (5) in (4), an approximate Variance
of the Sample Size is thus given by

$$V[N] \approx \frac{E[Z^2 \mid \theta] - E[N]\, V[z_i \mid \theta] - E^2[Z \mid \theta]}{E^2[z_i \mid \theta]}$$

$$V[N] \approx \frac{V[Z \mid \theta] - E[N]\, V[z_i \mid \theta]}{E^2[z_i \mid \theta]} \tag{6}$$


Unfortunately, using Eq. (6) to calculate the approximate
Variance of Sample Size leads to negative values in
many cases. This appears to be caused by the magnification
of the approximations used in Eq. (3) when they are entered
in Eq. (5). Whether or not this is in fact the true cause, the
approximation in general is not good. Consequently, no
numerical tabulations of $V[N]$ using Eq. (6) are included.
However, empirically determined "exact" values of $V[N]$
are included.
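A short numerical sketch shows how the approximation $V[N] \approx (V[Z \mid \theta] - E[N]\,V[z_i \mid \theta]) / E^2[z_i \mid \theta]$ can turn negative. The function name `approx_var_n` is hypothetical, and the moments of $Z$ are taken from the "no excess" approximations $E[Z \mid \theta] \approx L \ln B + (1-L)\ln A$ and $E[Z^2 \mid \theta] \approx L(\ln B)^2 + (1-L)(\ln A)^2$.

```python
import math

def approx_var_n(L, A, B, ez, vz):
    """Approximate V[N] from Eq. (6).

    L : OC value L(theta);  A, B: stopping bounds
    ez: E[z_i | theta] (non-zero);  vz: V[z_i | theta]
    """
    ln_a, ln_b = math.log(A), math.log(B)
    e_z = L * ln_b + (1.0 - L) * ln_a             # approximate E[Z | theta]
    e_z2 = L * ln_b ** 2 + (1.0 - L) * ln_a ** 2  # approximate E[Z^2 | theta]
    e_n = e_z / ez                                # Eq. (3)
    var_z = e_z2 - e_z ** 2                       # approximate V[Z | theta]
    return (var_z - e_n * vz) / ez ** 2           # Eq. (6)

# Illustrative case: Normal observations with unit variance, theta0 = 0,
# theta1 = 1, evaluated at theta = theta0, so E[z_i] = -0.5 and V[z_i] = 1.
value = approx_var_n(0.95, 19.0, 1.0 / 19.0, -0.5, 1.0)
```

In this illustrative case the numerator is about $1.65 - 5.30$, so `value` is negative (about $-14.6$), which is the behavior described in the text.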
IV. DISTRIBUTIONS INVESTIGATED


The author investigated the SPRT for three common
distributions: the Binomial, Normal, and Exponential.
Each distribution was investigated for various parameter
values. Each specific distribution was used in calculating
the Wald "approximate" results and in generating
empirical "exact" results for the OC curve, Expected
Sample Size, and the Variance of Sample Size.

Explicit equations for the points on the Wald "approximate"
Operating Characteristic Curve, the "approximate"
Expected Sample Size Curve, and the "approximate" Variance
of Sample Size Curve are given for these distributions using
Eqs. (1), (3), and (6).


A. BINOMIAL DISTRIBUTION


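Suppose $f(x; \theta) = \theta^x (1-\theta)^{1-x}$, $x = 0, 1$; the logarithm of the
likelihood ratio is

$$z_i = x_i \ln\frac{\theta_1}{\theta_0} + (1 - x_i) \ln\frac{1-\theta_1}{1-\theta_0}$$

Then

$$E[z_i \mid \theta] = \theta \ln\frac{f(1; \theta_1)}{f(1; \theta_0)} + (1-\theta) \ln\frac{f(0; \theta_1)}{f(0; \theta_0)}$$

$$E[z_i \mid \theta] = \theta \ln\frac{\theta_1}{\theta_0} + (1-\theta) \ln\frac{1-\theta_1}{1-\theta_0} \tag{7}$$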
and

$$V[z_i \mid \theta] = E[z_i^2 \mid \theta] - E^2[z_i \mid \theta]$$

$$V[z_i \mid \theta] = \theta(1-\theta) \left[\ln\frac{\theta_1 (1-\theta_0)}{\theta_0 (1-\theta_1)}\right]^2 \tag{8}$$


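In order to obtain an approximate OC function, the
expression of Eq. (2) for this case,

$$\theta \left(\frac{\theta_1}{\theta_0}\right)^{h(\theta)} + (1-\theta) \left(\frac{1-\theta_1}{1-\theta_0}\right)^{h(\theta)} = 1$$

was solved for $\theta$ and evaluated with various selected values
of $h(\theta)$:

$$\theta = \frac{1 - \left(\frac{1-\theta_1}{1-\theta_0}\right)^{h(\theta)}}{\left(\frac{\theta_1}{\theta_0}\right)^{h(\theta)} - \left(\frac{1-\theta_1}{1-\theta_0}\right)^{h(\theta)}} \tag{9}$$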


Utilizing Eq. (9) with Eq. (1), points on the Wald approximate
OC Curve were obtained.

By substituting Eq. (7) into Eq. (3) the Wald approximate
Sample Size curve was determined. An attempt was
made to determine the approximate Variance of Sample Size,
utilizing Eqs. (7) and (8) with the prior results of
$E[N]$ in Eq. (6).
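The "reverse" process for the Binomial case can be sketched as follows: pick a non-zero $h$, solve Eq. (2) for $\theta$ in closed form, then evaluate the OC value from Eq. (1) and the Expected Sample Size from Eq. (3). This is an illustrative sketch, not a transcription of the thesis's "Approximate Binomial" program; the function name `binomial_point` is hypothetical, and the Wald bounds $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ are assumed.

```python
import math

def binomial_point(h, theta0, theta1, alpha, beta):
    """Return (theta, L(theta), E[N]) on the Wald approximate curves
    for a Bernoulli SPRT and a chosen non-zero h."""
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    a = theta1 / theta0                      # ratio for a success
    b = (1.0 - theta1) / (1.0 - theta0)      # ratio for a failure
    theta = (1.0 - b ** h) / (a ** h - b ** h)            # Eq. (2) solved for theta
    oc = (A ** h - 1.0) / (A ** h - B ** h)               # Eq. (1)
    ez = theta * math.log(a) + (1.0 - theta) * math.log(b)   # E[z_i | theta]
    en = (oc * math.log(B) + (1.0 - oc) * math.log(A)) / ez  # Eq. (3)
    return theta, oc, en

# Sweep h over a few non-zero values to trace the curves.
curve = [binomial_point(h, 0.1, 0.3, 0.05, 0.05) for h in (2.0, 1.0, 0.5, -0.5, -1.0, -2.0)]
```

As a check on the construction, $h = 1$ reproduces $\theta = \theta_0$ with OC value $1 - \alpha$, and $h = -1$ reproduces $\theta = \theta_1$ with OC value $\beta$.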

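The acceptance and rejection numbers were respectively
computed from

$$A_n = \frac{\ln B + n \ln\frac{1-\theta_0}{1-\theta_1}}{\ln\frac{\theta_1}{\theta_0} - \ln\frac{1-\theta_1}{1-\theta_0}}$$

and

$$R_n = \frac{\ln A + n \ln\frac{1-\theta_0}{1-\theta_1}}{\ln\frac{\theta_1}{\theta_0} - \ln\frac{1-\theta_1}{1-\theta_0}}$$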


B. NORMAL DISTRIBUTION 


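If $f(x; \theta)$ is Normal $(\theta, \sigma^2)$ then the log likelihood
ratio is

$$z_i = \frac{1}{2\sigma^2}\left[2(\theta_1 - \theta_0) x_i + (\theta_0^2 - \theta_1^2)\right]$$

and

$$E[z_i \mid \theta] = \frac{2(\theta_1 - \theta_0)\theta + \theta_0^2 - \theta_1^2}{2\sigma^2} \tag{10}$$

and

$$V[z_i \mid \theta] = \frac{(\theta_1 - \theta_0)^2}{\sigma^2} \tag{11}$$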


Substituting Eq. (10) into Eq. (3), points on the Wald
approximate Sample Size Curve were computed. Eqs. (10) and
(11) with $E[N]$ computed above were used in Eq. (6) in an
attempt to determine the approximate Variance of N.

In order to determine points on the approximate OC
curve, the expression of Eq. (2) for the present case was
solved for $h(\theta)$, resulting in

$$h(\theta) = \frac{\theta_1 + \theta_0 - 2\theta}{\theta_1 - \theta_0}$$
The acceptance and rejection numbers are, in this case,
respectively given by

$$A_n = \frac{\sigma^2}{\theta_1 - \theta_0} \ln B + n\,\frac{\theta_0 + \theta_1}{2}$$

and

$$R_n = \frac{\sigma^2}{\theta_1 - \theta_0} \ln A + n\,\frac{\theta_0 + \theta_1}{2}$$


C. EXPONENTIAL DISTRIBUTION

If $f(x; \theta) = \theta e^{-\theta x}$, $x > 0$, the log likelihood ratio
is

$$z_i = \ln\frac{\theta_1}{\theta_0} - (\theta_1 - \theta_0) x_i$$

so that

$$E[z_i \mid \theta] = \ln\frac{\theta_1}{\theta_0} - \frac{\theta_1 - \theta_0}{\theta} \tag{12}$$

and

$$V[z_i \mid \theta] = \frac{(\theta_1 - \theta_0)^2}{\theta^2} \tag{13}$$

Substituting Eq. (12) into Eq. (3), points on the Wald
approximate Sample Size curve were obtained. Eqs. (12) and (13)
were used in Eq. (6) in an attempt to determine an approximate
Variance of N.
In order to determine values for an approximate OC
curve, Eq. (2) was solved for $\theta$, resulting in

$$\theta = \frac{h(\theta)(\theta_1 - \theta_0)}{\left(\frac{\theta_1}{\theta_0}\right)^{h(\theta)} - 1} \tag{14}$$

Utilizing Eqs. (1) and (14) points on the Wald approximate
OC Curve were obtained.
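For the Exponential case the same "reverse" process applies. The sketch below assumes the rate parameterization $f(x; \theta) = \theta e^{-\theta x}$, for which Eq. (2) gives $\theta = h(\theta_1 - \theta_0) / [(\theta_1/\theta_0)^h - 1]$; the function name `exponential_oc_point` is hypothetical and the code is not taken from the thesis programs.

```python
def exponential_oc_point(h, theta0, theta1, A, B):
    """Return (theta, L(theta)) on the Wald approximate OC curve for an
    exponential SPRT (rate parameterization assumed) and a non-zero h."""
    if h == 0:
        raise ValueError("h must be non-zero")
    theta = h * (theta1 - theta0) / ((theta1 / theta0) ** h - 1.0)  # Eq. (2) solved for theta
    oc = (A ** h - 1.0) / (A ** h - B ** h)                         # Eq. (1)
    return theta, oc
```

With hypothetical values $\theta_0 = 1$, $\theta_1 = 2$ and bounds $A = 19$, $B = 1/19$, the choice $h = 1$ returns $\theta = 1$ with OC value $.95$, and $h = -1$ returns $\theta = 2$ with OC value $.05$.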


The acceptance and rejection numbers are respectively
given by

$$A_n = \frac{n \ln\frac{\theta_1}{\theta_0} - \ln B}{\theta_1 - \theta_0}$$

and

$$R_n = \frac{n \ln\frac{\theta_1}{\theta_0} - \ln A}{\theta_1 - \theta_0}$$
V. PROCEDURE


Computer programs, coded in FORTRAN IV, were written
which give points on the Wald approximate OC Curves and
the approximate Expected Sample Size curves. Programs
for the Binomial, Normal, and Exponential distributions,
each entitled "Approximate (Distribution)," are listed under
Computer Programs.

In order to evaluate the Wald approximations, companion
programs called "exact" programs were written. These
programs produce Monte Carlo simulation of the SPRT procedure
for the distribution considered. Each program
determines "exact" points on the OC curve, Expected Sample
Size curve and Variance of Sample Size curve.

The simulation models sequential sampling from a
specific known distribution with parameter values being
inputs to the simulation. As each sample was observed, a
corresponding acceptance and rejection number was computed
and the test statistic was computed. These values were
compared and the appropriate alternative was selected.


A. DETERMINATION OF THE NUMBER OF REPLICATIONS 

In order for the simulation estimates of the Operating
Characteristic points, $\hat{L}(\theta)$ (a Bernoulli parameter), to be
useful, it is necessary to use a large number of replications
(that is, simulate the performance of many tests) so
that with a high probability $\hat{L}(\theta)$ is "close" to the true
$L(\theta)$. The Normal approximation to the binomial was used to
determine the number of replications that should be used
at each sample point. This was done as follows:

Let $S_J$ denote the number of successes in $J$ independent
repeated Bernoulli trials with parameter $p$, and let
$\hat{p}_J = S_J / J$ denote the average number of successes in $J$ trials. For
large $J$, $\hat{p}_J$ can be shown to be approximately Normal with
mean $p$ and variance $p(1-p)/J$. Then $Y = (\hat{p}_J - p)/[p(1-p)/J]^{1/2}$
is approximately Normal (0, 1). For any level of risk,
$\gamma > 0$, and minimum acceptable probability bound, $\delta > 0$, we
seek $J$ such that

$$P[\,|\hat{p}_J - p| < \gamma\,] = P\left[\frac{|\hat{p}_J - p|}{\sqrt{p(1-p)/J}} < \frac{\gamma}{\sqrt{p(1-p)/J}}\right] \ge \delta$$

or

$$P\left[-\frac{\gamma}{\sqrt{p(1-p)/J}} < Y < \frac{\gamma}{\sqrt{p(1-p)/J}}\right] \ge \delta$$

This occurs whenever

$$J \ge \frac{p(1-p)\, y_\delta^2}{\gamma^2}$$

where $y_\delta$ is such that $P[-y_\delta \le Y \le y_\delta] = \delta$, where $Y$ is distributed
Normal (0, 1). At $p = 1-p = 1/2$, $J \ge y_\delta^2 / (4\gamma^2)$.
For $\gamma = .01$ and $\delta = .95$, $y = 1.645$ and the required
value of $J$ is approximately 6,765. Additionally:


Dees Nore ema eles 6 = 95 Regu wes) ld) a= alee ales 
OS "sO 3 ay ae 6 = .95 requires J = 163; 
De Vaal, ye = rere 6 = .95 requires) J = 2095. 
p= aoe yF= F301, 6 = .95 requires J = 4889. 
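The replication bound $J \ge p(1-p)\,y^2/\gamma^2$ is easy to evaluate numerically. The following is a minimal sketch with hypothetical function names; the critical value $y$ is passed in explicitly so that the value 1.645 used in the text can be reproduced. For reference, the standard Normal value giving two-sided coverage of .95 is about 1.960, while 1.645 gives two-sided coverage of about .90, so the choice of $y$ matters when interpreting the results.

```python
import math

def required_replications(p, gamma, y):
    """Smallest real-valued J with p(1-p) y^2 / gamma^2 <= J."""
    return p * (1.0 - p) * y ** 2 / gamma ** 2

def half_width(p, J, y):
    """Half-width gamma achieved by J replications at Bernoulli parameter p."""
    return y * math.sqrt(p * (1.0 - p) / J)

j_worst = required_replications(0.5, 0.01, 1.645)   # about 6765, the worst case p = 1/2
g_5000 = half_width(0.5, 5000, 1.645)               # half-width with 5000 replications
```

Under these assumptions `j_worst` is about 6765, matching the figure in the text, and 5000 replications give a worst-case half-width of roughly .0116 at $p = 1/2$.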


In view of these and similar determinations and since it
was desired to estimate $L(\theta)$ values as large as .30 to
within reasonable accuracy, a selection of 5000 replications
for each estimation of $L(\theta)$ value was made. The
values obtained in the simulation should therefore, with
high probability, be accurate to at least 2, and usually
3, decimal places. A fourth decimal was carried in the
tables to exclude possible round off errors.


B. COMPUTER SIMULATION 

For each distribution investigated two computer programs
were developed: an "approximate program" based on the Wald
approximations for the SPRT, and the "exact program"
involving a Monte Carlo simulation. The inputs necessary
for these programs are the number of distributions to be
inspected, the parameter values for the Null and Alternate
Hypotheses ($\theta_0$ and $\theta_1$), the "target" Type I and Type II
Error Probabilities ($\alpha$ and $\beta$), parameters for the distribution
being sampled ($\sigma^2$), arguments for the random number
generator (URN, a special random number generator included
in the IBM Scientific Package at the Naval Postgraduate
School), the number of replications (in this thesis 5000)
for each test point, ($\theta$), and the number of test points
from the "approximate program" being used in the "exact
program." These values, punched in one input card, and
the associated output from the "approximate program,"
when run first, comprise the input data for the exact
program.
Simulation of observations from Binomial, Normal, and
Exponential distributions was by standard methods. Background
information may be obtained in [Naylor, 1967] and
[McMillan and Gonzalez, 1968].

In order to obtain the values tabulated in this thesis
the procedure was to first run an "approximate program."
These programs produce a fixed number of test points. In
the Approximate Binomial program the number of test points
(now 85) is controlled by the last "IF" statement in the
program; varying the value 41 changes the number of test
points. Control of the number of test points (80) in the
Approximate Normal program is by changing the numerical
increments in the first "XM" and "XM" steps. Control
of the number of test points (50) in the Approximate Exponential
program is by changing the limit on the "DO 7"
statement. The presentation of the Approximate Program
output (Approximate OC Value, $\theta$, Approximate Expected
Sample Size) is a listing and a separate punched card for
each test point. Prior to running the "exact program,"
the number of test points was reduced to 15 to 20 so as to
reduce the execution time. The Exact Program output (see
tables) is a listing and a set of punched cards.

"Exact" OC values of various SPRT's were estimated by
the relative frequency of the number of replications terminating
with acceptance of $H_0$, at each of several values
of $\theta$.

"Exact" expected sample sizes at each test point, $\theta$,
were estimated by tallying, each time the test was performed,
the observed sample requirement, the average sample size
being computed at the end of the 5000 replications. These
averages were taken to be the exact values of $E[N]$.

The "exact" variance of sample size at each test point,
$\theta$, was computed in a manner similar to that used for
$E[N]$,

$$V[N] = \frac{1}{n} \sum_{i=1}^{n} m_i^2 - E^2[N]$$

where $n$ = 5000 replications and $m_i$ denotes the number of
observations required in the $i$th simulated test.
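The Monte Carlo estimation just described can be sketched for the Binomial case as follows. This is an illustrative sketch only, not the thesis's FORTRAN "exact" program: it uses Python's `random.Random` rather than the URN generator, works with the cumulative log likelihood ratio instead of acceptance and rejection numbers (an equivalent formulation), assumes the Wald bounds $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$, and assumes a divisor of $n$ in the variance estimate. The function name and arguments are hypothetical.

```python
import math
import random

def simulate_binomial_sprt(theta, theta0, theta1, alpha, beta, reps=5000, seed=1):
    """Estimate (OC value, E[N], V[N]) at the true parameter theta."""
    rng = random.Random(seed)
    ln_a = math.log((1.0 - beta) / alpha)            # upper stopping bound (log scale)
    ln_b = math.log(beta / (1.0 - alpha))            # lower stopping bound (log scale)
    z_success = math.log(theta1 / theta0)            # log ratio increment for x = 1
    z_failure = math.log((1.0 - theta1) / (1.0 - theta0))  # increment for x = 0
    accepted = 0
    sizes = []
    for _ in range(reps):
        total, n = 0.0, 0
        while ln_b < total < ln_a:                   # draw another observation
            n += 1
            total += z_success if rng.random() < theta else z_failure
        if total <= ln_b:                            # test ended with acceptance of H0
            accepted += 1
        sizes.append(n)
    mean_n = sum(sizes) / reps
    var_n = sum(m * m for m in sizes) / reps - mean_n ** 2
    return accepted / reps, mean_n, var_n
```

Running the routine at $\theta = \theta_0$ and at $\theta = \theta_1$ gives estimates of $1 - \alpha$ and $\beta$ for the realized test, which can then be set beside the target values.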


C. COMPUTER STATISTICS

The IBM-360 at the Naval Postgraduate School was 
utilized in computing the values presented. The maximum 
core space needed for any one of the programs was less 
than 58K bytes. The necessary time to complete an 
"approximate" case was less than 5 seconds. The "exact" 


programs were run under the H compiler of the IBM-360. In 
certain cases this allowed an execution time savings of
80% with respect to running under the G compiler. The
running time per case for the programs averaged 11 minutes
as presented, with a maximum of about 15 minutes.
VI. CONCLUSIONS AND RESULTS


A comparison of the Wald "approximate" results with
the "exact" results described above confirmed the known
fact that, in all cases, the results obtained using the
Wald approximations for the SPRT are conservative in that
a given test plan's error probabilities are greater than
the exact values. In the Binomial cases as much as a
22% difference in the target $\alpha$ level and the exact $\alpha$
level was noted; for the Exponential distribution in one
case (Hy = dO, i, = 5.0; a= .03, 8 = .05) essen 
difference was noted, with the differences generally 
averaging about 25 to 30%. For the Normal distribution 
the percentage difference in the target $\alpha$ level and the
exact $\alpha$ level appears to increase with $|\theta_1 - \theta_0|$.

A comparison of the approximate and exact expected
sample sizes is facilitated by the tabulated values. In
all cases the results following indicate that at each test 
point the "exact" expected sample size is larger than the 
Wald approximation. 

The exact variances of sample size found here support
the conjecture of [Cox and Roseberry, 1966a], namely,
that $V[N]$ is approximately the square of $E[N]$. Unfortunately,
as noted earlier, a natural approximation for
$V[N]$ turns out to yield especially poor approximations.
However, the exact variance of N does appear to increase 


roughly as the square of E[N], in the cases considered here. 
Forty-six cases were examined. The final


results of all cases are tabulated below. 